A principled framework for uncertainty decomposition in TabPFN
Fortini, Sandra, Ng, Kenyon, Petrone, Sonia, Rousseau, Judith, Wei, Susan
TabPFN is a transformer that achieves state-of-the-art performance on supervised tabular tasks by amortizing Bayesian prediction into a single forward pass. However, there is currently no method for uncertainty decomposition in TabPFN. Because it behaves, in an idealised limit, as a Bayesian in-context learner, we cast the decomposition challenge as a Bayesian predictive inference (BPI) problem. The main computational tool in BPI, predictive Monte Carlo, is challenging to apply here as it requires simulating unmodeled covariates. We therefore pursue the asymptotic alternative, filling a gap in the theory for supervised settings by proving a predictive CLT under quasi-martingale conditions. We derive variance estimators determined by the volatility of predictive updates along the context. The resulting credible bands are fast to compute, target epistemic uncertainty, and achieve near-nominal frequentist coverage. For classification, we further obtain an entropy-based uncertainty decomposition.
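As a rough illustration of what an entropy-based uncertainty decomposition for classification can look like, the sketch below uses the standard split of total predictive entropy into an expected-entropy (aleatoric) term and a mutual-information (epistemic) term. It is not the estimator from the paper: the function names and the list of hypothetical predictive draws `prob_samples` are assumptions made only for this example.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def decompose(prob_samples):
    """Split total predictive entropy into aleatoric and epistemic parts.

    prob_samples: list of class-probability vectors, one per hypothetical
    draw of the predictive distribution (an assumption of this sketch).
    """
    n = len(prob_samples)
    k = len(prob_samples[0])
    # Averaged predictive distribution over all draws.
    mean_p = [sum(p[j] for p in prob_samples) / n for j in range(k)]
    total = entropy(mean_p)                               # total uncertainty
    aleatoric = sum(entropy(p) for p in prob_samples) / n  # expected entropy
    epistemic = total - aleatoric                          # mutual information
    return total, aleatoric, epistemic

# Two draws that disagree strongly: most of the uncertainty is epistemic.
samples = [[0.9, 0.1], [0.1, 0.9]]
total, aleatoric, epistemic = decompose(samples)
```

When all draws agree, the epistemic term vanishes and the total entropy is purely aleatoric, which is the behaviour one would expect from any decomposition of this kind.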
Exploring a Principled Framework for Deep Subspace Clustering
Meng, Xianghan, Huang, Zhiyuan, He, Wei, Qi, Xianbiao, Xiao, Rong, Li, Chun-Guang
Subspace clustering is a classical unsupervised learning task, built on the basic assumption that high-dimensional data can be approximated by a union of subspaces (UoS). Nevertheless, real-world data often deviate from the UoS assumption. To address this challenge, state-of-the-art deep subspace clustering algorithms attempt to jointly learn UoS representations and self-expressive coefficients. However, the general framework of the existing algorithms suffers from catastrophic feature collapse and lacks a theoretical guarantee of learning the desired UoS representation. In this paper, we present a Principled fRamewOrk for Deep Subspace Clustering (PRO-DSC), which is designed to learn structured representations and self-expressive coefficients in a unified manner. Specifically, in PRO-DSC, we incorporate an effective regularization on the learned representations into the self-expressive model, prove that the regularized self-expressive model is able to prevent feature space collapse, and demonstrate that the learned optimal representations, under certain conditions, lie on a union of orthogonal subspaces. Moreover, we provide a scalable and efficient approach to implement PRO-DSC and conduct extensive experiments to verify our theoretical findings and demonstrate the superior performance of our proposed deep subspace clustering approach. The code is available at https://github.com/mengxianghan123/PRO-DSC.
A Principled Framework for Evaluating on Typologically Diverse Languages
Ploeger, Esther, Poelman, Wessel, Høeg-Petersen, Andreas Holck, Schlichtkrull, Anders, de Lhoneux, Miryam, Bjerva, Johannes
Beyond individual languages, multilingual natural language processing (NLP) research increasingly aims to develop models that perform well across languages generally. However, evaluating these systems on all the world's languages is practically infeasible. To attain generalizability, representative language sampling is essential. Previous work argues that generalizable multilingual evaluation sets should contain languages with diverse typological properties. However, 'typologically diverse' language samples have been found to vary considerably in this regard, and popular sampling methods are flawed and inconsistent. We present a language sampling framework for selecting highly typologically diverse languages given a sampling frame, informed by language typology. We compare sampling methods with a range of metrics and find that our systematic methods consistently retrieve more typologically diverse language selections than previous methods in NLP. Moreover, we provide evidence that this affects generalizability in multilingual model evaluation, emphasizing the importance of diverse language sampling in NLP evaluation.
A Principled Framework for Knowledge-enhanced Large Language Model
Wang, Saizhuo, Liu, Zhihan, Wang, Zhaoran, Guo, Jian
Large Language Models (LLMs) are versatile, yet they often falter in tasks requiring deep and reliable reasoning due to issues like hallucinations, limiting their applicability in critical scenarios. This paper introduces a rigorously designed framework for creating LLMs that effectively anchor knowledge and employ a closed-loop reasoning process, enhancing their capability for in-depth analysis. We dissect the framework to illustrate the contribution of each component to the LLMs' performance, offering a theoretical assurance of improved reasoning under well-defined assumptions.
Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency
Liu, Zhihan, Hu, Hao, Zhang, Shenao, Guo, Hongyi, Ke, Shuqi, Liu, Boyi, Wang, Zhaoran
Large language models (LLMs) demonstrate impressive reasoning abilities, but translating reasoning into actions in the real world remains challenging. In particular, it remains unclear how to complete a given task provably within a minimum number of interactions with the external environment, e.g., through an internal mechanism of reasoning. To this end, we propose a principled framework with provable regret guarantees to orchestrate reasoning and acting, which we call "reason for future, act for now" (RAFA). Specifically, we design a prompt template for reasoning that learns from the memory buffer and plans a future trajectory over a long horizon ("reason for future"). At each step, the LLM agent takes the initial action of the planned trajectory ("act for now"), stores the collected feedback in the memory buffer, and reinvokes the reasoning routine to replan the future trajectory from the new state. The key idea is to cast reasoning in LLMs as learning and planning in Bayesian adaptive Markov decision processes (MDPs). Correspondingly, we prompt LLMs to form an updated posterior of the unknown environment from the memory buffer (learning) and generate an optimal trajectory for multiple future steps that maximizes a value function (planning). The learning and planning subroutines are performed in an "in-context" manner to emulate the actor-critic update for MDPs. Our theoretical analysis proves that the novel combination of long-term reasoning and short-term acting achieves a $\sqrt{T}$ regret. In particular, the regret bound highlights an intriguing interplay between the prior knowledge obtained through pretraining and the uncertainty reduction achieved by reasoning and acting. Our empirical validation shows that it outperforms various existing frameworks and achieves nearly perfect scores on a few benchmarks.
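The plan, act, store, replan loop described in this abstract can be sketched on a toy problem. The code below is only an illustration under strong assumptions: a brute-force search over action sequences stands in for the LLM reasoning routine, an optimistic guess for unseen transitions stands in for the posterior over environments, and the environment, `GOAL`, and all function names are hypothetical.

```python
from itertools import product

GOAL = 3
ACTIONS = (-1, +1)

def env_step(state, action):
    """True dynamics, hidden from the agent: a walk on 0..GOAL, clipped at the ends."""
    return max(0, min(GOAL, state + action))

def model_step(memory, state, action):
    """Learned model: replay observed transitions, guess optimistically otherwise."""
    return memory.get((state, action), GOAL)

def plan(memory, state, horizon):
    """'Reason for future': choose the action sequence with the best modelled return."""
    best_seq, best_value = None, float("-inf")
    for seq in product(ACTIONS, repeat=horizon):
        s, value = state, 0
        for a in seq:
            s = model_step(memory, s, a)
            value += 1 if s == GOAL else 0  # reward for each modelled step at the goal
        if value > best_value:
            best_seq, best_value = seq, value
    return best_seq

def run(horizon=3, max_steps=20):
    """Replan at every step and execute only the first planned action."""
    memory, state, steps = {}, 0, 0
    while state != GOAL and steps < max_steps:
        action = plan(memory, state, horizon)[0]  # 'act for now'
        next_state = env_step(state, action)
        memory[(state, action)] = next_state      # store feedback in the memory buffer
        state, steps = next_state, steps + 1
    return state, steps, memory

final_state, n_steps, buffer = run()
```

Only the first action of each plan is executed, so every new observation can change the next plan. That replanning step is the part of the abstract this sketch tries to make concrete; the regret analysis and the in-context learning component are not represented.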
Principled Frameworks for Evaluating Ethics in NLP Systems
Prabhumoye, Shrimai, Mayfield, Elijah, Black, Alan W
Those discussions have focused on data collection, experimental design, and interventions in modeling. But we argue that we ought to first understand the frameworks of ethics that are being used to evaluate the fairness and justice of algorithmic systems. Here, we begin that discussion by outlining deontological ethics, and envision a research agenda prioritized by it. Due to the sheer global reach of machine learning and NLP applications, they are empowered to impact societies (Hovy and Spruit, 2016) - potentially for the worse. Potential harms include exclusion of communities due to demographic bias, overgeneralization of model predictions to amplify bias or prejudice, and overstepping privacy concerns in the pursuit of data and quantification (Mieskes, 2017).